Efficiently Updating Cost Repository Values for Query Optimization on Web Data Sources in a Mediator/Wrapper Environment

Authors

  • Justo Hidalgo
  • Alberto Pan
  • Manuel Álvarez
  • Jaime Guerrero
Abstract

Optimizing access to sources in a mediator/wrapper environment is a critical need. For a variety of reasons, relational-based optimization techniques are of no use when handling HTTP-based web sources, so new approaches that take client/server communication costs into account must be devised. This paper describes a cost model that stores values for a complete set of web-source-focused parameters obtained by the web wrappers. It uses a novel updating technique that processes the values measured by the wrappers in previous query executions and generates a new model instance in each iteration at an efficient processing cost. This instance allows rapid value updates in response to changes in server quality or bandwidth, which are typical in this context. The results of these techniques are demonstrated both theoretically and through an implementation that shows how performance improves on real-world web sources compared with classical approaches.
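The abstract does not state the update rule itself, so the following is only an illustrative sketch of the general idea of folding a new wrapper measurement into a stored estimate at constant cost. It swaps in a plain exponential moving average, which is not claimed to be the paper's technique. The class name `CostRepository`, the weight `alpha`, and the source and parameter names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CostRepository:
    """Hypothetical store of per-source cost estimates (not the paper's model)."""

    alpha: float = 0.3  # assumed weight given to the newest measurement
    estimates: dict = field(default_factory=dict)

    def update(self, source: str, parameter: str, measured: float) -> float:
        """Fold one wrapper measurement into the stored estimate in O(1)."""
        key = (source, parameter)
        previous = self.estimates.get(key)
        if previous is None:
            # First observation: nothing to blend with yet.
            new_value = measured
        else:
            # Exponential moving average: recent values count more,
            # so the estimate follows changes in server quality or bandwidth.
            new_value = self.alpha * measured + (1 - self.alpha) * previous
        self.estimates[key] = new_value
        return new_value


# Two measurements of an assumed "response_time_ms" parameter for one source.
repo = CostRepository(alpha=0.5)
repo.update("example-source", "response_time_ms", 400.0)
latest = repo.update("example-source", "response_time_ms", 800.0)
```

A larger `alpha` makes the estimate react faster to a change in the source but also makes it noisier; how the paper balances this is not recoverable from the abstract.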

Similar resources

Leveraging Mediator Cost Models with Heterogeneous Data Sources

Distributed systems require declarative access to diverse information sources. One approach to solving this heterogeneous distributed database problem is based on mediator architectures. In these architectures, mediators accept queries from users, process them with respect to wrappers, and return answers. Wrappers provide access to underlying sources. To efficiently process queries, the mediator m...

Searching and Querying Wide-Area Distributed Collections

The rapid proliferation of widely-distributed data and document collections raises the need for wrapper/mediator architectures that can handle the challenges of wide area query processing. Traditional query and search techniques do not scale to large numbers of repositories and cannot cope with the unpredictable performance and (un)availability of access to such repositories. Research at the U...

Validating Mediator Cost Models with Disco

Disco is a mediator system developed at INRIA for accessing heterogeneous data sources over the Internet. In Disco, mediators accept queries from users, process them with respect to wrappers, and return answers. Wrappers provide access to underlying sources. To efficiently process queries, the mediator performs cost-based query optimization. In a heterogeneous distributed database, cost estimate based que...

Query Processing over Incomplete Autonomous Web Databases by Hemal Khatri

Incompleteness due to missing attribute values (aka “null values”) is very common in autonomous web databases, on which user accesses are usually supported through mediators. Traditional query processing techniques that focus on the strict soundness of answer tuples often ignore tuples with critical missing attributes, even if they wind up being relevant to the user query. Ideally, the mediator...

Integration of Heterogeneous Data Sources with Limited Capabilities in the Object-Oriented Mediator Engine AMOS II

Information is becoming an increasingly valuable asset in today’s organizations. Therefore the need to create an integrated view over all available data sources arises. Several technical problems must be overcome in the design and implementation of a system for integrating different data sources. The main obstacles include autonomy, data heterogeneity and different query capabilities of the repo...

Publication date: 2006